Skip to content

Refactor test_parquet.py to use check_round_trip at module level #19332

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 23, 2018

Conversation

maximveksler
Copy link
Contributor

@maximveksler maximveksler commented Jan 21, 2018

Refactoring and unification of testing approach in test_parquet.py module.

Iteration upon work that was done for #19135 (comment)

  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • tests added / passed

@jreback jreback added Testing pandas testing functions or related to the test suite IO Parquet parquet, feather labels Jan 21, 2018

def test_options_py(df_compat, pa):
# use the set option
if engine:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a case where engine is not required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, mostly when testing for "defaults" handling (i.e. default engine, or engine being configured via option_context. Please see here https://github.com/pandas-dev/pandas/pull/19332/files#diff-539eb41a29c68cc27818db4b480c2fd6R163

def test_invalid_engine(df_compat):
def check_round_trip(df, engine=None, path=None,
write_kwargs=None, read_kwargs=None,
expected=None, check_names=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a proper doc-string here

df.to_parquet(path)
if path is None:
with tm.ensure_clean() as path:
df.to_parquet(path, **write_kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make both of these into a loop with a kwarg in the signature repeat=2

for _ in range(repeat):
    .....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks...

Python is not native for me, I was looking for a nicer way to do that 👍

then compares the 2 resulting DataFrames to verify full
cycle is successful.

:param df: Dataframe to be serialized to disk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use the numpy-doc format (just at any other doc-string)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include Parameters, Raises sections

then compares the 2 resulting DataFrames to verify full
cycle is successful.

:param df: Dataframe to be serialized to disk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include Parameters, Raises sections

@codecov
Copy link

codecov bot commented Jan 22, 2018

Codecov Report

Merging #19332 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19332      +/-   ##
==========================================
- Coverage   91.57%   91.54%   -0.03%     
==========================================
  Files         150      150              
  Lines       48700    48700              
==========================================
- Hits        44595    44583      -12     
- Misses       4105     4117      +12
Flag Coverage Δ
#multiple 89.91% <ø> (-0.03%) ⬇️
#single 41.72% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/plotting/_converter.py 65.22% <0%> (-1.74%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ba65ca...f687979. Read the comment docs.


write_kwargs = write_kwargs or {'compression': None}
read_kwargs = read_kwargs or {}
expected = expected or df
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if expected is None:
    expected = df

write_kwargs: dict of str:str, optional
read_kwargs: dict of str:str, optional
expected: DataFrame, optional
Expected deserialization result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say otherwise will be equal to the input

@jreback jreback added this to the 0.23.0 milestone Jan 23, 2018
@jreback jreback merged commit 5fdb9c0 into pandas-dev:master Jan 23, 2018
@jreback
Copy link
Contributor

jreback commented Jan 23, 2018

thanks @maximveksler

@maximveksler maximveksler deleted the test_parquet_refactor branch May 16, 2018 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Parquet parquet, feather Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants